Getting demographic data
Below, we provide instructions for downloading the data used to identify underserved communities in Tampa Bay. For instructions on cleaning the data and using the demographic indices to map underserved communities, see Mapping underserved communities.
Load the required R packages (install first as needed).
library(sf)
library(dplyr) # provides the %>% pipe used in the code below
library(mapview)
To collect the demographic data used for identifying underserved communities, we will download U.S. census data provided by the EPA’s 2022 Environmental Justice Screening Tool (EJScreen). This data is available from https://gaftp.epa.gov/EJSCREEN/2022/. There you will find different versions of the EJScreen data, summarized, calculated, and visualized in different ways to suit particular needs (e.g., census block groups or tracts, state or national percentiles, tabular or spatial data).
In our case, we are interested in obtaining spatial data for the supplemental demographic indices, summarized at both the census block group and tract levels, using national percentiles as our thresholds for identifying underserved communities. Block groups are statistical divisions of census tracts, generally defined to contain between 600 and 3,000 people. This is the highest resolution of spatial data provided by EJScreen. Census tracts are aggregations of block groups that generally contain 1,200 to 8,000 people. The tract level is advantageous because it is the highest resolution for which the federal government provides standardized demographic, socioeconomic, and environmental data.
Data by census block group
The appropriate file to download for our requirements at the block group level is “EJSCREEN_2022_Supplemental_with_AS_CNMI_GU_VI.gdb.zip”.
Download the relevant file from EJScreen. The file is saved to a temporary file. Note that it is large (~480 MB), so the download will take some time.
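Because the file is large, the transfer may run longer than the 60-second default of R's `timeout` option, which `download.file()` respects. A minimal sketch of raising that limit for the current session before running the download code is shown here. The one-hour value is our assumption, not an EJScreen recommendation, and should be adjusted for your connection.

```r
# Remember the current setting so it can be restored later if desired
old_timeout <- getOption("timeout")

# Raise the limit to at least one hour (3600 seconds is an assumed value)
options(timeout = max(3600, old_timeout))
```

The original setting can be restored afterwards with `options(timeout = old_timeout)`.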
# url with zip gdb to download
urlin <- 'https://gaftp.epa.gov/EJSCREEN/2022/EJSCREEN_2022_Supplemental_with_AS_CNMI_GU_VI.gdb.zip'
# download file
tmp1 <- tempfile(fileext = ".zip")
download.file(url = urlin, destfile = tmp1)
Unzip the downloaded geodatabase into the temporary directory for the R session.
# unzip file
tmp2 <- tempdir()
utils::unzip(tmp1, exdir = tmp2)
Read the polygon layer from the geodatabase.
# get the layers from the gdb
gdbpth <- list.files(tmp2, pattern = '\\.gdb$', full.names = TRUE)
gdbpth <- gsub('\\\\', '/', gdbpth)
lyr <- st_layers(gdbpth)$name
# read the layer
dat <- st_read(dsn = gdbpth, layer = lyr)
To exclude block groups outside of our watershed boundary, intersect the layer with the Tampa Bay watershed. If working in a different area, you will want to replace the tbshed object with your own boundary layer.
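To illustrate how the clipping step behaves, the sketch below runs the same three operations on a small stand-in dataset. It is only an illustration under stated assumptions: the North Carolina county shapefile bundled with sf stands in for the census layer, the bounding box of the first ten counties is a hypothetical boundary in place of the watershed, and EPSG:32119 is our choice of projected coordinate system.

```r
library(sf)

# Stand-in for the census layer: North Carolina counties bundled with sf
polys <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
polys <- st_transform(polys, 32119) # projected CRS chosen for this sketch

# Hypothetical boundary: the bounding box of the first ten counties
bnd <- st_sf(geometry = st_as_sfc(st_bbox(polys[1:10, ])))

# Same three steps as the watershed clip: match CRS, repair geometry, intersect
clipped <- st_transform(polys, crs = st_crs(bnd))
clipped <- st_make_valid(clipped)
# st_intersection() warns that attributes are assumed spatially constant;
# the warning is silenced here because only the geometry matters
clipped <- suppressWarnings(st_intersection(clipped, bnd))

# Compare feature counts before and after clipping
nrow(polys)
nrow(clipped)
```

Polygons that straddle the boundary are cut at its edge rather than dropped, so any attribute that depends on area (such as a population count) no longer describes the clipped shape exactly. The code that follows applies the same steps to the EJScreen layer and the Tampa Bay watershed.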
load(file = 'data/tbshed.RData')
# intersect the layer with the tb watershed
tb_blockgroup <- dat %>%
  st_transform(crs = st_crs(tbshed)) %>%
  st_make_valid() %>%
  st_intersection(tbshed)
The layer can be saved as an RData object if needed. The size should be minimal (~2 MB).
# save the layer as an RData object
save(tb_blockgroup, file = 'data/tb_blockgroup.RData')
View the data using mapview. You can see that we now have the desired spatial data just for our watershed.
load(file = 'data/tb_blockgroup.RData')
mapview(tb_blockgroup)